Joint Behavior and Common Belief
Friedenberg, Meir, Halpern, Joseph Y.
The past few years have seen an uptick in interest in studying cooperative AI, that is, AI systems designed to be effective at cooperating. Indeed, a number of influential researchers recently argued that "[w]e need to build a science of cooperative AI... progress towards socially valuable AI will be stunted unless we put the problem of cooperation at the centre of our research" [6]. One type of cooperative behavior is joint behavior: collaboration scenarios where the success of the joint action depends on all agents doing their parts; a single agent deviating can render the efforts of the others ineffective. Joint behavior has been studied in detail under various names, such as "acting together", "teamwork", "collaborative plans", and "shared plans", and highly influential models of it have been developed (see, e.g., [2, 4, 10, 11, 15, 24]). Efforts have also been made to engineer some of these theories into real-world joint planning systems [23, 20].
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada > Ontario > Middlesex County > London (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
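The all-or-nothing character of joint behavior described in the abstract can be made concrete with a toy formalization (not from the paper): the team's payoff is positive only if every agent performs its part of the shared plan, so any single deviation voids everyone else's effort.

```python
def joint_payoff(actions, plan):
    """Toy all-or-nothing joint action: the team succeeds only if
    every agent performs its assigned part of the plan."""
    return 1.0 if all(a == p for a, p in zip(actions, plan)) else 0.0

plan = ["lift", "steady", "push"]
print(joint_payoff(["lift", "steady", "push"], plan))  # 1.0 — everyone does their part
print(joint_payoff(["lift", "rest", "push"], plan))    # 0.0 — one deviation voids the effort
```

The names `joint_payoff` and the action strings are illustrative only; the cited teamwork models formalize this dependence in much richer (intentional and epistemic) terms.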
PMIC: Improving Multi-Agent Reinforcement Learning with Progressive Mutual Information Collaboration
Li, Pengyi, Tang, Hongyao, Yang, Tianpei, Hao, Xiaotian, Sang, Tong, Zheng, Yan, Hao, Jianye, Taylor, Matthew E., Tao, Wenyuan, Wang, Zhen, Barez, Fazl
Learning to collaborate is critical in Multi-Agent Reinforcement Learning (MARL). Previous works promote collaboration by maximizing the correlation of agents' behaviors, typically characterized by Mutual Information (MI) in different forms. However, we reveal that sub-optimal collaborative behaviors also emerge under strong correlations, and that simply maximizing the MI can, surprisingly, hinder learning towards better collaboration. To address this issue, we propose a novel MARL framework, called Progressive Mutual Information Collaboration (PMIC), for more effective MI-driven collaboration. PMIC uses a new collaboration criterion measured by the MI between global states and joint actions. Based on this criterion, the key idea of PMIC is to maximize the MI associated with superior collaborative behaviors while minimizing the MI associated with inferior ones. The two MI objectives play complementary roles, facilitating better collaborations while avoiding convergence to sub-optimal ones. Experiments on a wide range of MARL benchmarks show the superior performance of PMIC compared with other algorithms.
- North America > Canada > Alberta (0.14)
- Asia > China > Tianjin Province > Tianjin (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
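PMIC's collaboration criterion is the MI between global states and joint actions. As a minimal sketch of that criterion only (PMIC itself learns neural MI estimators over continuous quantities; this toy assumes discrete, fully observed samples), the snippet below computes empirical I(S; A) from state/joint-action pairs. A policy whose joint action is determined by the state yields high MI; a policy that ignores the state yields zero.

```python
import math
from collections import Counter

def empirical_mi(states, joint_actions):
    """Empirical mutual information I(S; A), in nats, between discrete
    global states and joint actions observed in paired samples."""
    n = len(states)
    p_s = Counter(states)
    p_a = Counter(joint_actions)
    p_sa = Counter(zip(states, joint_actions))
    mi = 0.0
    for (s, a), c in p_sa.items():
        joint = c / n
        mi += joint * math.log(joint / ((p_s[s] / n) * (p_a[a] / n)))
    return mi

# State-dependent coordination: joint action determined by state -> MI = ln 2
states = [0, 0, 1, 1]
coordinated = [("l", "l"), ("l", "l"), ("r", "r"), ("r", "r")]
print(round(empirical_mi(states, coordinated), 3))  # 0.693

# State-blind behavior: same joint action everywhere -> MI = 0
print(round(empirical_mi(states, [("l", "l")] * 4), 3))  # 0.0
```

Note that, as the abstract warns, high MI alone does not guarantee *good* collaboration — a consistently coordinated but low-reward policy also scores high on this measure, which is why PMIC pairs maximization with minimization of MI on inferior behaviors.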
Beyond Rewards: a Hierarchical Perspective on Offline Multiagent Behavioral Analysis
Omidshafiei, Shayegan, Kapishnikov, Andrei, Assogba, Yannick, Dixon, Lucas, Kim, Been
Each year, expert-level performance is attained in increasingly complex multiagent domains, with notable examples including Go, Poker, and StarCraft II. This rapid progression is accompanied by a commensurate need to better understand how such agents attain this performance, to enable their safe deployment, identify limitations, and reveal potential means of improving them. In this paper we take a step back from performance-focused multiagent learning and instead turn our attention towards agent behavior analysis. We introduce a model-agnostic method for discovery of behavior clusters in multiagent domains, using variational inference to learn a hierarchy of behaviors at the joint and local agent levels. Our framework makes no assumption about agents' underlying learning algorithms, does not require access to their latent states or policies, and is trained using only offline observational data. We illustrate the effectiveness of our method for enabling coupled understanding of behaviors at the joint and local agent levels, detecting behavior changepoints throughout training, and discovering core behavioral concepts; we also demonstrate the approach's scalability to a high-dimensional multiagent MuJoCo control domain and show that it can disentangle previously-trained policies in OpenAI's hide-and-seek domain.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)
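The paper's method is a variational hierarchy over joint and local behaviors; that model is well beyond a short snippet. As a deliberately simplified, hypothetical stand-in that shares only the offline setting, the sketch below summarizes each trajectory as a feature vector (here, an invented mean joint-action featurization) and groups them with plain k-means, illustrating the basic idea of discovering behavior clusters from observational data alone, with no access to policies or learning algorithms.

```python
def kmeans(points, k, iters=20):
    """Plain k-means over per-trajectory feature vectors: a toy
    stand-in for the paper's variational behavior clustering."""
    centers = list(points[:k])  # deterministic init: first k points

    def nearest(p):
        return min(range(k),
                   key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))

    for _ in range(iters):
        # assign every trajectory feature to its closest center
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p)].append(p)
        # recompute each center as the mean of its assigned points
        for j, c in enumerate(clusters):
            if c:
                centers[j] = tuple(sum(x) / len(c) for x in zip(*c))
    return [nearest(p) for p in points]

# Six synthetic trajectories from two distinct "behaviors",
# each summarized as a 2-D mean joint-action vector (made up for illustration)
feats = [(0.0, 0.1), (1.0, 0.9), (0.1, 0.0),
         (0.05, 0.05), (0.9, 1.0), (0.95, 0.95)]
labels = kmeans(feats, k=2)
print(labels)  # → [0, 1, 0, 0, 1, 1]
```

Unlike this sketch, the paper's model infers behavior assignments jointly at the team and per-agent levels and can localize changepoints over training; the point here is only that offline trajectory summaries suffice as clustering input.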